Skip to content

poc: Extensible Mapper using JavaScript #3650

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 75 commits into
base: main
Choose a base branch
from

Conversation

didier-wenzek
Copy link
Contributor

@didier-wenzek didier-wenzek commented May 28, 2025

Proposed changes

This POC explores quickJS as an alternative to:

A thin-edge mapper that can be extended by users with mapping rules implemented in JavaScript.

Features:

  • Chain stateless filters along pipelines: MQTT sub| filter-1 | filter-2 | ... | filter-n | MQTT pub
  • Configure filters
  • Update filter config using MQTT
  • Implement state-full filters (deferring messages up to the end of a time window)
  • Combine user-provided JS filters with built-in Rust filters

Criteria:

  • Expressive power?
  • Dev tooling ?
  • Performance? Memory footprint?
  • Debuggability?

Examples:

  • Add missing timestamps
  • Discard measurements with timestamp out of the expected range (too old or in the future)
  • Translate thin-edge json measurment into cumulocity json
  • Add measurement units - read from topic metadata
  • Translate Collectd data into thin-edge JSON
  • Group messages received during the same time window
  • Compute 5-minute rolling average, min, max
  • A circuit breaker filter detecting that too many messages are being sent within a given period
  • See also real use-case examples

Types of changes

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Improvement (general improvements like code refactoring that doesn't explicitly fix a bug or add any new functionality)
  • Documentation Update (if none of the other choices apply)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Paste Link to the issue


Checklist

  • I have read the CONTRIBUTING doc
  • I have signed the CLA (in all commits with git commit -s. You can activate automatic signing by running just prepare-dev once)
  • I ran just format as mentioned in CODING_GUIDELINES
  • I used just check as mentioned in CODING_GUIDELINES
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)

Further comments

Copy link

codecov bot commented May 28, 2025

Codecov Report

Attention: Patch coverage is 79.43925% with 22 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
crates/extensions/tedge_mqtt_ext/src/lib.rs 80.23% 13 Missing and 4 partials ⚠️
crates/common/mqtt_channel/src/connection.rs 50.00% 2 Missing ⚠️
crates/core/tedge_mapper/src/lib.rs 0.00% 2 Missing ⚠️
crates/core/plugin_sm/src/operation_logs.rs 0.00% 0 Missing and 1 partial ⚠️

📢 Thoughts on this report? Let us know!

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@didier-wenzek didier-wenzek marked this pull request as draft May 28, 2025 14:21
@didier-wenzek
Copy link
Contributor Author

The javascript examples have to be updated.

For instance the syntax config.topic || "te/device/main///m/collectd" is not supported. See fbe1341

@reubenmiller
Copy link
Contributor

The javascript examples have to be updated.

For instance the syntax config.topic || "te/device/main///m/collectd" is not supported. See fbe1341

Though it seems that Optional Chaining is part of ES2020 (which means it should be supported by QuickJS, I confirmed on the WASM version of QuickJS at least), so that would be the most similar to your previous syntax (just adding a ? before the .).

For example:

let topic = config?.topic || "te/device/main///m/collectd";

Copy link
Contributor

@albinsuresh albinsuresh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is my first round of review comments focussing purely on the pipeline APIs and not much of the rust infra code. Unfortunately, I'm having some trouble testing the feature due to some certificate misconfiguration preventing the mapper from even starting with the following error:

Jun 10 06:36:23 2a9636a9acc5 tedge-mapper[3107]: 2025-06-10T06:36:23.630655575Z ERROR Runtime: Actor MQTT-2 has finished unsuccessfully: ActorError(InvalidPrivateKey(InvalidCertificate(Other(OtherError(UnsupportedCertVersion)))))

Will provide additional usability feedback once that issue is resolved.

stages = [
{ filter = "add_timestamp.js" },
{ filter = "drop_stragglers.js", config = { max_delay = 60 } },
{ filter = "te_to_c8y.js", meta_topics = ["te/+/+/+/+/m/+/meta"] }
Copy link
Contributor

@albinsuresh albinsuresh Jun 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if stage-wise meta_topics is really necessary or a pipe-line level meta_topics would be sufficient (and simpler to maintain).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What we want here is for the messages received on the meta topics to be directly delivered to the stage. The other stages have no way to do something meaningful with these messages (In this case, units for various measurement types).

A better name might be config_topics.

stages = [
{ filter = "add_timestamp.js" },
{ filter = "drop_stragglers.js", config = { max_delay = 60 } },
{ filter = "te_to_c8y.js", meta_topics = ["te/+/+/+/+/m/+/meta"] }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expressing the correlation between the input_topic and meta_topic with wildcards appears slightly inaccurate/misleading. For example, when the above definition is applied on a te/+/+/+/+/m/temperature measurement, the corresponding meta_topic should have only been te/+/+/+/+/m/temperature/meta. But with the wildcard te/+/+/+/+/m/+/meta subscription, it covers the meta topics of other measurements like m/pressure/meta, m/humidity/meta etc as well. Wondering if we should introduce some sort of variable substitution syntax like m/{m_type} and m/{m_type}/meta so that this correlation is better expressed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What you propose seems too specific to this specific example (adding units to measurements leveraging the meta topics as defined by thin-edge). Sure the script (here te_to_c8y.js) has to appropriately match data and meta data, but why should the generic mapper enforce that from the outside?

@albinsuresh
Copy link
Contributor

Unfortunately, I'm having some trouble testing the feature due to some certificate misconfiguration preventing the mapper from even starting with the following error:

Jun 10 06:36:23 2a9636a9acc5 tedge-mapper[3107]: 2025-06-10T06:36:23.630655575Z ERROR Runtime: Actor MQTT-2 has finished unsuccessfully: ActorError(InvalidPrivateKey(InvalidCertificate(Other(OtherError(UnsupportedCertVersion)))))

Will provide additional usability feedback once that issue is resolved.

Though I couldn't fix the cert issue, I've moved on by completely disabling auth settings.


stages = [
{ filter = "add_timestamp.js" },
{ filter = "circuit-breaker.js", tick_every_seconds = 1, config = { stats_topic = "te/error", too_many = 10000, message_on_too_many = { topic = "te/device/main///a/too-many-messages", payload = "too many messages" }, message_on_back_to_normal = { topic = "te/device/main///a/too-many-messages", payload = "back to normal" } } }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line highlights pain points with the current configuration format.

  • There are two levels of config which is confusing. tick_every_second is used by the engine, while config.too_many is passed to the script. Flattening the config might be simpler.
  • TOML inline tables are intended to appear on a single line. Leading to painfully long lines as here.
  • Passing a JSON payload to the filter config is not straightforward. Currently the engine expects a string and the JS filter has to stringyfy the payload.

@didier-wenzek didier-wenzek force-pushed the poc/tedge-quickjs-mapper branch from fba7bdd to 2e41d5e Compare June 24, 2025 18:38
@didier-wenzek didier-wenzek force-pushed the poc/tedge-quickjs-mapper branch from 2e41d5e to e759966 Compare June 24, 2025 18:45
@didier-wenzek didier-wenzek force-pushed the poc/tedge-quickjs-mapper branch from 5d6a90a to f2ca7b5 Compare July 22, 2025 19:55
@didier-wenzek didier-wenzek temporarily deployed to Test Pull Request July 22, 2025 19:55 — with GitHub Actions Inactive
For now, the bindgen feature is enable for all targets.
One could consider to use bindings shipped with quickjs when available.
See https://github.com/delskayn/rquickjs?tab=readme-ov-file#supported-platforms

Signed-off-by: Didier Wenzek <didier.wenzek@free.fr>
@didier-wenzek didier-wenzek temporarily deployed to Test Pull Request July 23, 2025 07:39 — with GitHub Actions Inactive
Signed-off-by: Didier Wenzek <didier.wenzek@free.fr>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants